Abstract. This paper analyzes the scaling window of a random CSP model (i.e. model RB) for which we can identify the threshold points exactly, denoted by $r_{cr}$ or $p_{cr}$. For this model, we establish the scaling window $W(n,\delta) = (r_-(n,\delta),\ r_+(n,\delta))$ such that the probability of a random instance being satisfiable is greater than $1-\delta$ for $r < r_-(n,\delta)$ and is less than $\delta$ for $r > r_+(n,\delta)$. Specifically, we obtain the following result

$W(n,\delta) = \left(r_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right),\ r_{cr} + \Theta\left(\frac{1}{n\ln n}\right)\right)$,

where $0 \le \epsilon < 1$ is a constant. A similar result with respect to the other parameter $p$ is also obtained. Since the instances generated by model RB have been shown to be hard at the threshold, this is the first attempt, as far as we know, to analyze the scaling window of such a model with hard instances.
1. Introduction

The Constraint Satisfaction Problem (CSP), which originated in artificial intelligence, has become an important and active field of statistical physics, information theory and computer science. The CSP area is highly interdisciplinary, since it embeds ideas from many research fields, such as artificial intelligence, databases, programming languages and operations research. A constraint satisfaction problem consists of a finite set $U = \{u_1, u_2, \dots, u_n\}$ of $n$ variables, each $u_i$ associated with a domain of values $D_i$, and a set of constraints. Each constraint $C_{i_1 i_2 \dots i_k}$ is a relation, defined on some subset $\{u_{i_1}, u_{i_2}, \dots, u_{i_k}\}$ of the $n$ variables, called its scope, that specifies the legal tuples of values. A solution to a CSP is an assignment of a value to each variable from its domain such that all the constraints of this CSP are satisfied. A constraint is said to be satisfied if the tuple of values assigned to the variables in this constraint is a legal one. A CSP is called satisfiable if and only if it has at least one solution. The task of a CSP is to find a solution or to prove that no solution exists.

Given a CSP, we are interested in polynomial-time algorithms, that is, algorithms whose running time is bounded by a polynomial in the number of variables. Cook's Theorem [2] asserts that satisfiability is NP-complete and hence at least as hard as any problem whose solutions can be verified in polynomial time. Most of the interesting CSPs are NP-complete problems. The $k$-SAT problem is a canonical version of the CSPs, in which variables can be assigned the value True or False (called Boolean variables). Much effort has been devoted to $k$-SAT, and it is widely believed that no efficient algorithm exists for $k$-SAT. However, it has been shown that most instances of $k$-SAT can be solved efficiently, so perhaps genuine hardness is present only in a tiny fraction of all instances. In the 1990s, remarkable progress [31 Q31 HU H2] showed that the really difficult instances are related to the phase transition phenomenon, as suggested in the pioneering work of Fu and Anderson [6]. The study of phase transitions has subsequently attracted much interest O [12].

In recent years, random $k$-SAT has been well studied from both theoretical and algorithmic points of view. If $k = 2$, it is known that there is a satisfiability threshold at $\alpha_c = 1$ (here $\alpha$ represents the ratio of clauses $m$ to variables $n$), below which the probability of a random instance being satisfiable tends to 1 and above which it tends to 0 as $n$ approaches infinity [4]. This was sharpened in [8][14]. Random 2-SAT is now well understood. However, for $k \ge 3$, the existence of the phase transition phenomenon has not been established, let alone the exact value of the threshold point [Tl [TU].

To gain a better understanding of how the phase transition scales with problem size, the finite-size scaling method has been introduced from statistical mechanics [7]. In this method, observing how the width of a transition narrows with increasing sample size gives direct evidence for critical behavior at a phase transition. Finite-size scaling is the study of changes in the transition behavior due to finite-size effects, in particular the broadening of the transition region for finite $n$. More precisely, for $0 < \delta < 1$, let $r_-(n,\delta)$ be the supremum over $r$ such that the probability of a random CSP instance being satisfiable is at least $1-\delta$, and similarly, let $r_+(n,\delta)$ be the infimum over $r$ such that the probability of a random CSP instance being satisfiable is at most $\delta$. Then, for $r$ within the scaling window $W(n,\delta) = (r_-(n,\delta),\ r_+(n,\delta))$, the probability is between $\delta$ and $1-\delta$, and for all $\delta$, $|r_+(n,\delta) - r_-(n,\delta)| \to 0$ as $n \to \infty$. For random 2-SAT, it has been determined that the scaling window is

$W(n,\delta) = \left(1 - \Theta(n^{-1/3}),\ 1 + \Theta(n^{-1/3})\right)$.

Model RB is a random CSP model proposed by Xu and Li to overcome the trivial insolubility of standard CSP models [16]. For this model, we can not only establish the existence of phase transitions, but also pinpoint the threshold points exactly, denoted by $r_{cr}$ or $p_{cr}$. Moreover, it has been proved that almost all instances of model RB have no tree-like resolution proofs of less than exponential size [16]. This implies that, unlike random 2-SAT, model RB can be used to generate hard instances, which has also been confirmed by experiments]?]. Motivated by the work on the scaling window of random 2-SAT, in this paper we study the scaling window of model RB and obtain $W(n,\delta) = \left(r_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right),\ r_{cr} + \Theta\left(\frac{1}{n\ln n}\right)\right)$. We also obtain a similar result for the other control parameter $p$.

The main contribution of this paper is not to present new methods for computing the scaling window, but to show that for an interesting model with hard instances (i.e. model RB), not only can the threshold points be located exactly, but the scaling window can also be determined using standard methods. This suggests that more mathematical properties of the threshold behavior of model RB can be obtained in a relatively easy way, which will help to shed light on the phase transition phenomenon in NP-complete problems. The rest of the paper is organized as follows. In the next section, we give a brief introduction to model RB. The main results of this paper and their proofs are given in Section 3 and Section 4 respectively. Finally, we conclude in Section 5.
2. Model RB

We can pinpoint the threshold location for model RB, proposed by Xu and Li [16]. Random instances of model RB are generated as follows:

(1). Given a set $U$ of $n$ variables, select with repetition $m = rn\ln n$ random constraints. Each random constraint is formed by selecting without repetition $k$ of the $n$ variables, where $k \ge 2$ is an integer.

(2). Next, for each constraint we select uniformly at random without repetition $q = p \cdot d^k$ illegal tuples of values, i.e., each constraint contains exactly $(1-p) \cdot d^k$ legal ones, where $d = n^\alpha$ is the domain size of each variable and $\alpha > 0$ is a constant.
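The two generation steps above can be sketched in code. The following is a minimal illustration, not the authors' generator: the function names `generate_model_rb` and `satisfies` are hypothetical, and rounding $d$, $m$ and $q$ to integers is an assumption, since the text does not say how non-integer values are handled.

```python
import itertools
import math
import random


def generate_model_rb(n, k, alpha, r, p, seed=None):
    """Return (d, constraints); each constraint is (scope, illegal_tuples).

    Assumption: d = n^alpha, m = r n ln n and q = p d^k are rounded
    to the nearest integer.
    """
    rng = random.Random(seed)
    d = max(2, round(n ** alpha))      # domain size d = n^alpha
    m = round(r * n * math.log(n))     # number of constraints m = r n ln n
    q = round(p * d ** k)              # illegal tuples per constraint
    all_tuples = list(itertools.product(range(d), repeat=k))
    constraints = []
    for _ in range(m):                 # step (1): constraints drawn with repetition
        scope = tuple(rng.sample(range(n), k))          # k distinct variables
        illegal = frozenset(rng.sample(all_tuples, q))  # step (2): q distinct tuples
        constraints.append((scope, illegal))
    return d, constraints


def satisfies(assignment, constraints):
    """True if no constraint sees an illegal tuple under the assignment."""
    return all(
        tuple(assignment[v] for v in scope) not in illegal
        for scope, illegal in constraints
    )
```

Enumerating all $d^k$ tuples is only practical for small $k$ and $d$; the sketch favors clarity over scale.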

In this paper, the probability of a random CSP instance being satisfiable is denoted by Pr(Sat). It is proved that for model RB the phase transition phenomenon occurs at $r_{cr} = -\frac{\alpha}{\ln(1-p)}$ or $p_{cr} = 1 - e^{-\alpha/r}$ as $n$ approaches infinity [16]. More precisely, we have the following two theorems.

Theorem 2.1 [16] Let $r_{cr} = -\frac{\alpha}{\ln(1-p)}$. If $\alpha > \frac{1}{k}$, $0 < p < 1$ are two constants and $k$, $p$ satisfy the inequality $k \ge \frac{1}{1-p}$, then

$\lim_{n\to\infty}$ Pr(Sat) $= 1$ when $r < r_{cr}$,

$\lim_{n\to\infty}$ Pr(Sat) $= 0$ when $r > r_{cr}$.

Theorem 2.2 [16] Let $p_{cr} = 1 - e^{-\alpha/r}$. If $\alpha > \frac{1}{k}$, $r > 0$ are two constants and $k$, $\alpha$ satisfy the inequality $ke^{-\alpha/r} \ge 1$, then

$\lim_{n\to\infty}$ Pr(Sat) $= 1$ when $p < p_{cr}$,

$\lim_{n\to\infty}$ Pr(Sat) $= 0$ when $p > p_{cr}$.
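As a small numerical companion to Theorems 2.1 and 2.2, the sketch below evaluates the two threshold formulas and checks the side conditions. The function names are hypothetical, and the conditions are read here as $\alpha > 1/k$ together with the non-strict inequalities $k \ge 1/(1-p)$ and $ke^{-\alpha/r} \ge 1$; that reading is an assumption.

```python
import math


def r_critical(alpha, p):
    """r_cr = -alpha / ln(1 - p)."""
    return -alpha / math.log(1.0 - p)


def p_critical(alpha, r):
    """p_cr = 1 - exp(-alpha / r)."""
    return 1.0 - math.exp(-alpha / r)


def theorem_2_1_applies(alpha, p, k):
    """Side conditions of Theorem 2.1 (assumed non-strict in k)."""
    return alpha > 1.0 / k and 0.0 < p < 1.0 and k >= 1.0 / (1.0 - p)


def theorem_2_2_applies(alpha, r, k):
    """Side conditions of Theorem 2.2 (assumed non-strict in k)."""
    return alpha > 1.0 / k and r > 0.0 and k * math.exp(-alpha / r) >= 1.0
```

The two thresholds describe the same critical curve: fixing $p$ and computing $r_{cr}$, then feeding that $r$ into `p_critical`, returns the original $p$.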

3. Main results 
Our main results are the following two theorems. 

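Theorem 3.1 For all sufficiently small $\delta > 0$, there exist $r_-(n,\delta)$ and $r_+(n,\delta)$ such that the following holds:

Pr(Sat) $> 1-\delta$, when $r < r_-(n,\delta)$;
Pr(Sat) $< \delta$, when $r > r_+(n,\delta)$,

where $r_-(n,\delta) = r_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right)$ and $r_+(n,\delta) = r_{cr} + \Theta\left(\frac{1}{n\ln n}\right)$, so that the scaling window of model RB is

$W(n,\delta) = \left(r_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right),\ r_{cr} + \Theta\left(\frac{1}{n\ln n}\right)\right)$.

It is easy to see that $|r_+(n,\delta) - r_-(n,\delta)| \to 0$ as $n \to \infty$.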

Theorem 3.2 For all sufficiently small $\delta > 0$, there exist $p_-(n,\delta)$ and $p_+(n,\delta)$ such that the following holds:

Pr(Sat) $> 1-\delta$, when $p < p_-(n,\delta)$;

Pr(Sat) $< \delta$, when $p > p_+(n,\delta)$,

where $p_-(n,\delta) = p_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right)$ and $p_+(n,\delta) = p_{cr} + \Theta\left(\frac{1}{n\ln n}\right)$, so that the scaling window of model RB is

$W(n,\delta) = \left(p_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right),\ p_{cr} + \Theta\left(\frac{1}{n\ln n}\right)\right)$.

It is not difficult to see that $|p_+(n,\delta) - p_-(n,\delta)| \to 0$ as $n \to \infty$.

Remark 3.1 If $n \to \infty$, then $r_+(n,\delta), r_-(n,\delta) \to r_{cr}$ and $p_+(n,\delta), p_-(n,\delta) \to p_{cr}$. Since Theorem 3.1 and Theorem 3.2 hold for every sufficiently small $\delta$, we obtain

$\lim_{n\to\infty}$ Pr(Sat) $= 1$ when $r < r_{cr}$ or $p < p_{cr}$,

$\lim_{n\to\infty}$ Pr(Sat) $= 0$ when $r > r_{cr}$ or $p > p_{cr}$.

This is the result of Xu and Li [16].

4. Proof of the results 
To prove the main results, we need the following lemmas. 



Lemma 4.1 Let $c = \alpha + 1 - r_{cr}kp$, then $c < 1$.

Proof We know that $r_{cr} = -\frac{\alpha}{\ln(1-p)}$, then

$c = \alpha + 1 + \frac{\alpha kp}{\ln(1-p)} = 1 + \frac{\alpha[kp + \ln(1-p)]}{\ln(1-p)}$.

Let $f(p) = kp + \ln(1-p)$, hence we have $f'(p) = -\frac{1}{1-p} + k$.

By the condition of Theorem 2.1, we have $k \ge \frac{1}{1-p}$, hence $f'(x) = k - \frac{1}{1-x} > 0$ for all $0 \le x < p$. That is, $f$ is strictly increasing on $[0, p]$.

So $f(p) > f(0)$, that is $kp + \ln(1-p) > 0$. It is obvious that $\ln(1-p) < 0$ because $0 < p < 1$, and $\alpha > 0$ is a constant.

Hence $\frac{\alpha[kp + \ln(1-p)]}{\ln(1-p)} < 0$.

Therefore, it is proved that $c = 1 + \frac{\alpha[kp + \ln(1-p)]}{\ln(1-p)} < 1$.
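Lemma 4.1 can be sanity-checked numerically. The sketch below evaluates $c = \alpha + 1 - r_{cr}kp$ on a small grid of parameters that satisfy the conditions of Theorem 2.1 (read here as $\alpha > 1/k$ and $k \ge 1/(1-p)$, which is an assumption) and returns the largest value found. The grid and the function names are illustrative choices, and a grid check is of course no substitute for the proof.

```python
import math


def c_lemma_4_1(alpha, k, p):
    """c = alpha + 1 - r_cr * k * p with r_cr = -alpha / ln(1 - p)."""
    r_cr = -alpha / math.log(1.0 - p)
    return alpha + 1.0 - r_cr * k * p


def largest_c_on_grid():
    """Largest c over a small grid of admissible (k, p, alpha)."""
    worst = -math.inf
    for k in range(2, 7):
        for i in range(1, 100):
            p = i / 100.0
            if k < 1.0 / (1.0 - p):       # condition k >= 1/(1-p) fails
                continue
            for alpha in (0.6, 1.0, 2.0):
                if alpha <= 1.0 / k:      # condition alpha > 1/k fails
                    continue
                worst = max(worst, c_lemma_4_1(alpha, k, p))
    return worst
```

On this grid the returned value stays below 1, in line with the lemma.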
Lemma 4.2 Let $c = \alpha + 1 - rkp_{cr}$, then $c < 1$.
Proof We know that $p_{cr} = 1 - e^{-\alpha/r}$, so

$c = \alpha + 1 - rk(1 - e^{-\alpha/r}) = 1 - r\left[-\frac{\alpha}{r} + k(1 - e^{-\alpha/r})\right]$.

Let $-\frac{\alpha}{r} = x$, then $x \in (-\infty, 0)$. Suppose $h(x) = x + k(1 - e^x)$, then $h'(x) = 1 - ke^x$.

By the condition of Theorem 2.2, $ke^x = ke^{-\alpha/r} \ge 1$, hence $h'(y) = 1 - ke^y < 0$ for all $x < y \le 0$. That is, $h$ is strictly decreasing on $[x, 0]$.

So $h(x) > h(0)$, that is $h(x) > 0$. And $r > 0$ is a constant, hence it is proved that $c = 1 - r\left[-\frac{\alpha}{r} + k(1 - e^{-\alpha/r})\right] < 1$.

Proof of Theorem 3.1 Let $N$ denote the number of satisfying assignments for a random CSP instance. We obtain

(4.1) $E(N) = d^n(1-p)^{rn\ln n} = n^{\alpha n}(1-p)^{rn\ln n}$.

Assume that $E(N) < \delta$. By (4.1) we get

(4.2) $[\alpha + r\ln(1-p)]\,n\ln n < \ln\delta$,

(4.3) $\alpha + r\ln(1-p) < \frac{\ln\delta}{n\ln n}$,

(4.4) $r > -\frac{\alpha}{\ln(1-p)} + \frac{\ln\delta}{n\ln n\ln(1-p)} = r_{cr} + \frac{\ln\delta}{n\ln n\ln(1-p)}$,

where the inequality reverses in (4.4) because $\ln(1-p) < 0$. Using the Markov inequality Pr(Sat) $\le E(N)$, we get Pr(Sat) $< \delta$ for

(4.5) $r > r_{cr} + \Theta\left(\frac{1}{n\ln n}\right)$.

Here $f = \Theta(g)$ means that there exist two finite constants $c_1 > 0$ and $c_2 > 0$ such that $c_1 < f/g < c_2$.
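The first-moment step above can be checked numerically. The sketch below works with $\ln E(N)$ to avoid overflow and computes the value of $r$ at which $E(N)$ equals $\delta$, namely $r_{cr} + \frac{\ln\delta}{n\ln n\ln(1-p)}$. The function names are hypothetical and the parameter values in the usage note are arbitrary.

```python
import math


def log_expected_solutions(n, alpha, r, p):
    """ln E(N) = [alpha + r ln(1-p)] * n * ln n, from E(N) = n^(alpha n) (1-p)^(r n ln n)."""
    return (alpha + r * math.log(1.0 - p)) * n * math.log(n)


def upper_edge(n, alpha, p, delta):
    """Value of r at which E(N) = delta; above it, Markov gives Pr(Sat) < delta."""
    r_cr = -alpha / math.log(1.0 - p)
    return r_cr + math.log(delta) / (n * math.log(n) * math.log(1.0 - p))
```

For instance, with $\alpha = 0.8$, $p = 0.25$ and $\delta = 0.05$, `upper_edge(n, 0.8, 0.25, 0.05)` lies above $r_{cr}$ and moves toward it as $n$ grows, at the rate $\frac{1}{n\ln n}$ stated in (4.5).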

In the following, we use the Cauchy inequality Pr(Sat) $\ge \frac{E^2(N)}{E(N^2)}$ to prove that when $r < r_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right)$, we have Pr(Sat) $> 1-\delta$.

In the remaining part of the paper, the expression of $E(N^2)$ will play an important role in the proof of the main results. The derivation of this expression can be found in [16]. For the convenience of the reader, we give an outline of it as follows.
Definition 4.1 Let $(t_i, t_j)$ denote an ordered pair of assignments to the $n$ variables in $U$, which satisfies a CSP instance if and only if both $t_i$ and $t_j$ satisfy the CSP instance. $P((t_i,t_j))$ denotes the probability that $(t_i,t_j)$ satisfies a CSP instance.
Definition 4.2 The similarity number $S$ of an assignment pair $(t_i,t_j)$ is the number of variables for which $t_i$ and $t_j$ take identical values. It is obvious that $0 \le S \le n$, and let $s = \frac{S}{n}$. Let $A_S$ be the set of assignment pairs whose similarity number is equal to $S$.
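Definition 4.2 translates directly into code. The sketch below computes the similarity number of two assignments and counts the ordered pairs with a given similarity number by elementary counting: choose the first assignment freely, choose the $S$ positions where the two agree, and give each remaining position one of the $d-1$ other values. The function names are hypothetical, and the closed form for $|A_S|$ is this counting argument rather than a formula quoted from the text.

```python
from math import comb


def similarity_number(t_i, t_j):
    """Number of variables on which the two assignments agree."""
    return sum(1 for a, b in zip(t_i, t_j) if a == b)


def count_pairs_with_similarity(n, d, S):
    """|A_S| = d^n * C(n, S) * (d - 1)^(n - S) ordered pairs over a domain of size d."""
    return d ** n * comb(n, S) * (d - 1) ** (n - S)
```

Summing `count_pairs_with_similarity(n, d, S)` over $S = 0, \dots, n$ gives $d^{2n}$, the total number of ordered assignment pairs.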

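The expression of $E(N^2)$ is

$E(N^2) = \sum_{S=0}^{n} |A_S|P((t_i,t_j)) = \sum_{S=0}^{n} d^n\binom{n}{S}(d-1)^{n-S}\left[\frac{\binom{d^k-1}{q}}{\binom{d^k}{q}}\cdot\frac{\binom{S}{k}}{\binom{n}{k}} + \frac{\binom{d^k-2}{q}}{\binom{d^k}{q}}\left(1 - \frac{\binom{S}{k}}{\binom{n}{k}}\right)\right]^{rn\ln n}$.

First we need to estimate $E(N^2)$. We can rewrite the terms of the above sum as

(4.6) $|A_S|P((t_i,t_j)) = E^2(N)\left[1 + \frac{p}{1-p}\left(s^k + \frac{g(s)}{n}\right)\right]^{rn\ln n}\left(1 - \frac{1}{n^\alpha}\right)^{n-ns}\left(\frac{1}{n^\alpha}\right)^{ns}\binom{n}{ns}\left(1 + O\left(\frac{1}{n}\right)\right)$,

where $g(s) = \frac{k(k-1)(s^k - s^{k-1})}{2}$.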

When $n$ is sufficiently large, except for $E^2(N)$, the dominant contribution to (4.6) comes from

(4.7) $f(s) = \left(1 + \frac{p}{1-p}s^k\right)^{rn\ln n}\left(\frac{1}{n^\alpha}\right)^{ns} = e^{\left[r\ln\left(1 + \frac{p}{1-p}s^k\right) - \alpha s\right]n\ln n}$.

We put $h(s) = r\ln\left(1 + \frac{p}{1-p}s^k\right) - \alpha s$ and focus on the function $h(s)$. Differentiating $h(s)$ twice with respect to $s$, we get

(4.8) $h''(s) = \frac{rkps^{k-2}\left[(k-1)(1-p) - ps^k\right]}{(1 - p + ps^k)^2}$.

Applying the condition $k \ge \frac{1}{1-p}$, we get $(k-1)(1-p) - ps^k \ge 0$ on the interval $[0, 1]$, then $h''(s) \ge 0$. So $h(s)$ is a convex function. It is easy to see that $h(0) = 0$ and $h(1) = -r\ln(1-p) - \alpha$. So when $r < r_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right)$, we have $h(1) < 0$. By convexity, on the interval $0 < s < 1$ we get $h(s) < 0$. So there exist $0 < \delta_1 < 1$ and $0 < \delta_2 < 1$ such that when $r < r_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right)$, $f(s)$ is mainly decided by the values $s \in [0, \delta_1] \cup [1-\delta_2, 1]$. So we only need to consider those terms with $s \in [0, \delta_1] \cup [1-\delta_2, 1]$ to estimate (4.6). This is different from the proof in Xu and Li [16] for establishing the existence of phase transitions, where only those terms with $s \in [0, \delta_1]$ were considered.
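The convexity argument for $h(s)$ can be illustrated numerically. The sketch below implements $h(s) = r\ln\left(1 + \frac{p}{1-p}s^k\right) - \alpha s$ and the closed form of its second derivative; the function names are hypothetical and the parameter values mentioned afterwards are arbitrary choices satisfying $k \ge 1/(1-p)$.

```python
import math


def h(s, r, k, p, alpha):
    """h(s) = r ln(1 + p/(1-p) s^k) - alpha s."""
    return r * math.log(1.0 + p / (1.0 - p) * s ** k) - alpha * s


def h_second(s, r, k, p):
    """Second derivative r k p s^(k-2) [(k-1)(1-p) - p s^k] / (1 - p + p s^k)^2."""
    num = r * k * p * s ** (k - 2) * ((k - 1) * (1.0 - p) - p * s ** k)
    return num / (1.0 - p + p * s ** k) ** 2
```

With $k = 2$, $p = 0.25$, $\alpha = 0.8$ and $r = 0.9\,r_{cr}$, the function vanishes at $s = 0$, equals $-r\ln(1-p) - \alpha < 0$ at $s = 1$, and is negative in between, so the largest contributions to $f(s) = e^{h(s)\,n\ln n}$ come from the neighborhoods of the two endpoints.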

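(i) $s \in [0, \delta_1]$

We learn from Xu and Li [16] that

(4.9) $\sum_{s\in[0,\delta_1]} |A_S|P((t_i,t_j)) \le E^2(N)\left(1 + O\left(\frac{1}{n}\right)\right)$.

(ii) $s \in [1-\delta_2, 1]$

If $s \in [1-\delta_2, 1]$, then $s^k - s^{k-1} \le 0$, thus $g(s) = \frac{k(k-1)(s^k - s^{k-1})}{2} \le 0$. So we get the following inequality

$|A_S|P((t_i,t_j)) \le E^2(N)\left(1 + \frac{p}{1-p}s^k\right)^{rn\ln n}\left(1 - \frac{1}{n^\alpha}\right)^{n-ns}\left(\frac{1}{n^\alpha}\right)^{ns}\binom{n}{ns}\left(1 + O\left(\frac{1}{n}\right)\right)$

(4.10) $= E(N)\left(1 - p + ps^k\right)^{rn\ln n}(n^\alpha - 1)^{n-ns}\binom{n}{ns}\left(1 + O\left(\frac{1}{n}\right)\right)$.

When $s = 1$ ($S = n$), we obtain

(4.11) $|A_S|P((t_i,t_j)) = E(N)\left(1 + O\left(\frac{1}{n}\right)\right)$.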
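When $s = \frac{n-t}{n}$ ($S = n-t$), where $1 \le t \ll n$, we get

$|A_S|P((t_i,t_j)) \le E(N)\left[1 - p + p\left(\frac{n-t}{n}\right)^k\right]^{rn\ln n}(n^\alpha - 1)^t\binom{n}{t}\left(1 + O\left(\frac{1}{n}\right)\right)$

$\le E(N)\,e^{-p\left[1 - \left(\frac{n-t}{n}\right)^k\right]rn\ln n}(n^\alpha - 1)^t\binom{n}{t}\left(1 + O\left(\frac{1}{n}\right)\right)$

$\le E(N)\,\frac{n^{(\alpha+1-rkp+o(1))t}}{t!}\left(1 + O\left(\frac{1}{n}\right)\right)$

(4.12) $\le E(N)\left(n^{\alpha+1-rkp+o(1)}\right)^t\left(1 + O\left(\frac{1}{n}\right)\right)$.

When $n$ is sufficiently large, let $c = \alpha + 1 - r_{cr}kp = \alpha + 1 + \frac{\alpha kp}{\ln(1-p)}$. We discuss the value of $c$ in two cases.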
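Case 1: $c < 0$.

When $s = \frac{n-1}{n}$, by (4.12) we obtain

(4.13) $|A_S|P((t_i,t_j)) \le E(N) \cdot n^c \cdot \left(1 + O\left(\frac{1}{n}\right)\right)$.

When $s = \frac{n-2}{n}$, by (4.12) we have

(4.14) $|A_S|P((t_i,t_j)) \le E(N) \cdot n^{2c} \cdot \left(1 + O\left(\frac{1}{n}\right)\right)$.

So we get

$\sum_{s\in[1-\delta_2,1]} |A_S|P((t_i,t_j)) \le E(N)\left(1 + n^c + n^{2c} + \cdots\right)\left(1 + O\left(\frac{1}{n}\right)\right)$

(4.15) $= E(N)\left(1 + O(n^c)\right)$.

It follows from (i) and (ii) that

$E(N^2) = \sum_{S=0}^{n} |A_S|P((t_i,t_j)) \approx \sum_{s\in[0,\delta_1]} |A_S|P((t_i,t_j)) + \sum_{s\in[1-\delta_2,1]} |A_S|P((t_i,t_j))$

(4.16) $\le E^2(N)\left(1 + O\left(\frac{1}{n}\right)\right) + E(N)\left(1 + O(n^c)\right)$.

Consequently, by the Cauchy inequality, it suffices to have

(4.17) Pr(Sat) $\ge \frac{E^2(N)}{E(N^2)} \ge \frac{E^2(N)}{E^2(N)\left(1 + O\left(\frac{1}{n}\right)\right) + E(N)\left(1 + O(n^c)\right)} > 1-\delta$.

Solving the last inequality for $E(N)$ gives $E(N) > \frac{(1-\delta)(1 + O(n^c))}{\delta - O\left(\frac{1}{n}\right)}$. Putting the right-hand side equal to $\vartheta$, we have

(4.19) $\alpha n\ln n + rn\ln n\ln(1-p) > \ln\vartheta$,

(4.20) $r < \frac{\ln\vartheta - \alpha n\ln n}{n\ln n\ln(1-p)} = -\frac{\alpha}{\ln(1-p)} + \frac{\ln\vartheta}{n\ln n\ln(1-p)}$.

So we obtain that

(4.21) $r < r_{cr} + \frac{\ln\vartheta}{n\ln n\ln(1-p)}$.

Since $\ln(1-p) < 0$ and $\ln\vartheta > 0$ for sufficiently small $\delta$, the second term in (4.21) is negative. Thus when $r < r_{cr} - \Theta\left(\frac{1}{n\ln n}\right)$, we have the result Pr(Sat) $> 1-\delta$.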
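Case 2: $c > 0$.

When $1 \le t \ll n$, by the right side of (4.10), we get

(4.22) $\left[1 - p + p\left(1 - \frac{t}{n}\right)^k\right]^{rn\ln n}(n^\alpha - 1)^t\binom{n}{t} \le \sqrt{2\pi n}\cdot\frac{n^{(\alpha+1+o(1)-rkp)t}}{t^t}$.

Now when $n$ is sufficiently large, let $u_t = \frac{n^{(\alpha+1-rkp)t}}{t^t}$. Then $\ln u_t = ct\ln n - t\ln t$, and its derivative with respect to $t$ is $c\ln n - \ln t - 1$, which vanishes when $t = \frac{n^c}{e}$. It is known that $0 < c < 1$ by Lemma 4.1. So the bound (4.22) attains its maximal value $\sqrt{2\pi n}\cdot e^{n^c/e}$ at the point $t = \frac{n^c}{e}$. Summing over at most $n$ values of $t$, we have

(4.23) $\sum_{s\in[1-\delta_2,1]} |A_S|P((t_i,t_j)) \le E(N)\sqrt{2\pi n}\cdot e^{n^c/e}\cdot n\left(1 + O\left(\frac{1}{n}\right)\right)$.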
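The maximization step in Case 2 can be checked directly. The sketch below evaluates the exponent $ct\ln n - t\ln t$, whose derivative $c\ln n - \ln t - 1$ vanishes at $t = n^c/e$, and confirms that the value there is $n^c/e$. The function names and the sample values of $c$ and $n$ are illustrative assumptions.

```python
import math


def exponent(t, c, n):
    """The exponent c t ln n - t ln t appearing in Case 2."""
    return c * t * math.log(n) - t * math.log(t)


def maximizer(c, n):
    """Stationary point t* = n^c / e of the exponent."""
    return n ** c / math.e
```

At the stationary point the exponent equals $t^*(c\ln n - \ln t^*) = t^*$, which is the $e^{n^c/e}$ factor used in the bound on the sum over $s \in [1-\delta_2, 1]$.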
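We use the Cauchy inequality:

(4.24) Pr(Sat) $\ge \frac{E^2(N)}{E(N^2)} \ge \frac{E^2(N)}{\left(E^2(N) + E(N)\sqrt{2\pi n}\cdot e^{n^c/e}\cdot n\right)\left(1 + O\left(\frac{1}{n}\right)\right)} > 1-\delta$,

(4.25) $E(N) > \frac{1-\delta+O\left(\frac{1}{n}\right)}{\delta - O\left(\frac{1}{n}\right)}\sqrt{2\pi n}\cdot e^{n^c/e}\cdot n$.

Let $\frac{\sqrt{2\pi}\left(1-\delta+O\left(\frac{1}{n}\right)\right)}{\delta - O\left(\frac{1}{n}\right)} = \lambda$, then we get

(4.26) $\alpha n\ln n + rn\ln n\ln(1-p) > \ln\lambda + \frac{n^c}{e} + \frac{3}{2}\ln n$,

(4.27) $r < \frac{-\alpha n\ln n + \frac{n^c}{e} + \frac{3}{2}\ln n + \ln\lambda}{n\ln n\ln(1-p)} = r_{cr} + \frac{1}{en^{1-c}\ln n\ln(1-p)} + \frac{3}{2n\ln(1-p)} + \frac{\ln\lambda}{n\ln n\ln(1-p)}$.

Since $\ln(1-p) < 0$ and $0 < c < 1$, the dominant correction in (4.27) is the negative term $\frac{1}{en^{1-c}\ln n\ln(1-p)}$. So when $r < r_{cr} - \Theta\left(\frac{1}{n^{1-c}\ln n}\right)$, we have Pr(Sat) $> 1-\delta$.

Combining the above cases, it is proved that the scaling window of model RB is

$W(n,\delta) = \left(r_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right),\ r_{cr} + \Theta\left(\frac{1}{n\ln n}\right)\right)$,

where $\epsilon = c < 1$ in Case 2 and $\epsilon = 0$ in Case 1, and it is obvious that $\left|r_{cr} + \Theta\left(\frac{1}{n\ln n}\right) - \left(r_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right)\right)\right| \to 0$ ($n \to \infty$). Thus, we finish the proof of Theorem 3.1.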
Remark 4.1 By Lemma 4.1, we claim that $c$ increases with $p$ and decreases with $\alpha$. Therefore, when $0 < c < 1$, the convergence rate of $r_-(n,\delta)$ approaching $r_{cr}$ decreases with $p$ and increases with $\alpha$.

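Proof of Theorem 3.2 Similarly, we can also use (4.3) to obtain that

(4.28) $\ln(1-p) < -\frac{\alpha}{r} + \frac{\ln\delta}{rn\ln n}$,

$p > 1 - e^{-\frac{\alpha}{r} + \frac{\ln\delta}{rn\ln n}}$

$= 1 - e^{-\alpha/r} + e^{-\alpha/r}\left(1 - e^{\frac{\ln\delta}{rn\ln n}}\right)$

$= p_{cr} + e^{-\alpha/r}\left[1 - \left(1 + O\left(\frac{\ln\delta}{rn\ln n}\right)\right)\right]$

(4.29) $= p_{cr} + \Theta\left(\frac{1}{n\ln n}\right)$.

So when $p > p_{cr} + \Theta\left(\frac{1}{n\ln n}\right)$, we have Pr(Sat) $< \delta$.

Similar to the proof of Theorem 3.1, when $n$ is sufficiently large, let $c = \alpha + 1 - rkp_{cr}$. By Lemma 4.2, $c < 1$, so we again distinguish two cases, $c < 0$ and $0 < c < 1$, and obtain the following.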

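By (4.19), we get

(4.30) $\ln(1-p) > \frac{\ln\vartheta - \alpha n\ln n}{rn\ln n} = -\frac{\alpha}{r} + \frac{\ln\vartheta}{rn\ln n}$,

$p < 1 - e^{-\frac{\alpha}{r} + \frac{\ln\vartheta}{rn\ln n}}$

$= 1 - e^{-\alpha/r} + e^{-\alpha/r}\left(1 - e^{\frac{\ln\vartheta}{rn\ln n}}\right)$

$= p_{cr} + e^{-\alpha/r}\left[1 - \left(1 + O\left(\frac{\ln\vartheta}{rn\ln n}\right)\right)\right]$

(4.31) $= p_{cr} - \Theta\left(\frac{1}{rn\ln n}\right)$.

By (4.26), we have

(4.32) $\alpha n\ln n + rn\ln n\ln(1-p) > \ln\lambda + \frac{n^c}{e} + \frac{3}{2}\ln n$,

$p < 1 - e^{-\frac{\alpha}{r} + \frac{\frac{n^c}{e} + \frac{3}{2}\ln n + \ln\lambda}{rn\ln n}}$

$= 1 - e^{-\alpha/r} + e^{-\alpha/r}\left(1 - e^{\frac{\frac{n^c}{e} + \frac{3}{2}\ln n + \ln\lambda}{rn\ln n}}\right)$

$= p_{cr} + e^{-\alpha/r}\left[1 - \left(1 + O\left(\frac{1}{n^{1-c}\ln n}\right)\right)\right]$

(4.33) $= p_{cr} - \Theta\left(\frac{1}{n^{1-c}\ln n}\right)$.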

Thus the results are as follows:

Pr(Sat) $> 1-\delta$, when $p < p_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right)$;

Pr(Sat) $< \delta$, when $p > p_{cr} + \Theta\left(\frac{1}{n\ln n}\right)$.

Therefore the scaling window of model RB with respect to parameter $p$ is

$W(n,\delta) = \left(p_{cr} - \Theta\left(\frac{1}{n^{1-\epsilon}\ln n}\right),\ p_{cr} + \Theta\left(\frac{1}{n\ln n}\right)\right)$,

where $\epsilon = c$ when $0 < c < 1$ and $\epsilon = 0$ when $c < 0$, so that $0 \le \epsilon < 1$.

Remark 4.2 Similar to Remark 4.1, by Lemma 4.2, we obtain that the convergence rate of $p_-(n,\delta)$ approaching $p_{cr}$ increases with both $r$ and $\alpha$.

Note in particular that, when $n \to \infty$, we have

Pr(Sat) $\to 0$, when $r > r_{cr}$ or $p > p_{cr}$,
Pr(Sat) $\to 1$, when $r < r_{cr}$ or $p < p_{cr}$.
This is the result of Xu and Li [16].

5. Conclusions 

In this paper, we obtain the scaling window of model RB, for which the phase transition point is known exactly. As mentioned before, the scaling window of random 2-SAT has also been determined. However, random 2-SAT is easy to solve because 2-SAT is in the class P. Recently, both theoretical [17] and experimental results [15] suggest that model RB is abundant with hard instances, which are useful both for evaluating the performance of algorithms and for understanding the nature of hard problems. As far as we know, this paper is the first study of the scaling window of such a model with hard instances. We hope that it can help us to gain a better understanding of the phase transition phenomenon in NP-complete problems.